62 research outputs found

    Penggabungan Keputusan Pada Klasifikasi Multi-label

    Classification is the part of a learning system that focuses on understanding patterns through the representation and generalization of data. Choosing the best classification prediction becomes a problem when several methods supply different inputs in a heterogeneous data environment. Decision fusion can be used to derive a recommended output from several classification methods. We chose voting and meta-learning as the decision fusion approaches. The study has two phases: a phase in which heterogeneous classification methods build their predictions, and a phase in which the recommendations of those methods are combined into a single answer. The focus is on multi-label classification. Binary Relevance (BR), Classifier Chains (CC), Hierarchy of Multi-label Classifiers (HOMER), and Multi-label k Nearest Neighbors (MLkNN) are the classification methods used to provide prediction recommendations through different approaches. In the decision fusion phase, the Ignore method is proposed as a meta-learning approach. Ignore combines decisions by learning the input patterns produced by the learning systems. To compare against Ignore, a consensus method is used as the voting approach. The final results show that Ignore gives the best recall. Ignore predicts fewer false negatives than the consensus method at thresholds 0.5 and 0.75. The results of this study show that Ignore can be used as a meta-learner, although its performance must be improved so that it can adapt to heterogeneous data.
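A minimal sketch of consensus voting over multi-label predictions may help make the 0.5 and 0.75 thresholds concrete. The abstract does not define the exact rule, so this assumes a label is kept when the fraction of classifiers voting for it is at least the threshold; the function names `consensus_vote` and `recall` and the toy predictions are hypothetical.

```python
from typing import List


def consensus_vote(predictions: List[List[int]], threshold: float) -> List[int]:
    """Fuse binary label vectors from several classifiers.

    Assumption (not from the abstract): a label is assigned when the
    fraction of classifiers predicting it is >= threshold.
    """
    if not predictions:
        raise ValueError("need at least one classifier output")
    n_classifiers = len(predictions)
    n_labels = len(predictions[0])
    if any(len(p) != n_labels for p in predictions):
        raise ValueError("all classifiers must predict the same label set")

    fused = []
    for j in range(n_labels):
        votes = sum(p[j] for p in predictions)
        fused.append(1 if votes / n_classifiers >= threshold else 0)
    return fused


def recall(y_true: List[int], y_pred: List[int]) -> float:
    """Recall = TP / (TP + FN) over the labels of one instance."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy outputs of four classifiers (standing in for BR, CC, HOMER, MLkNN)
# for one instance with three candidate labels.
toy_predictions = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
]
fused_half = consensus_vote(toy_predictions, 0.5)
fused_strict = consensus_vote(toy_predictions, 0.75)
```

In this toy case the stricter threshold drops labels that only half of the classifiers support, which raises the number of false negatives and lowers recall; this is the kind of trade-off the comparison with Ignore is about.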

    Web Services Discovery and Recommendation Based on Information Extraction and Symbolic Reputation

    This paper shows that the representation of web services is a crucial problem and analyzes the factors that influence it. It first presents the traditional representation of web services, built from the textual descriptions contained in WSDL files. Unfortunately, these textual descriptions are noisy and need significant cleaning to keep only useful information. To deal with this problem, we introduce a rule-based text tagging method that filters web service descriptions and keeps only the significant information. A new representation based on the filtered data is then introduced. Many web services, however, have empty descriptions. We also consider representations based on the WSDL file structure (types, attributes, etc.). Alternatively, we introduce a new representation called symbolic reputation, which is computed from relationships between web services. The impact of these representations on web service discovery and recommendation is studied and discussed in experiments on real-world web services.

    Expansion de requêtes à base de motifs et de Word Embeddings pour améliorer la recherche de microblogs

    Social microblogging services play an especially significant role in our society. Twitter is one of the most popular microblogging sites, used by people to find relevant information (e.g., breaking news, popular trends, information about people of interest). In this context, retrieving information from such data has recently gained growing attention and opened new challenges. However, both the documents and the queries are usually short, which can affect the search results. Query Expansion (QE) can improve retrieval in this setting. Words can have several meanings, only one of which is used in a given context. In this paper, we propose a QE method that takes the meaning of the context into account. To do so, we use patterns and word embeddings to expand users' queries. We evaluate the proposed method experimentally on the TREC dataset. The results show the effectiveness of the approach: combining patterns with word embeddings significantly improves microblog retrieval.
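As a rough illustration of the embedding half of such a query expansion scheme, the sketch below adds to each query term its nearest neighbours by cosine similarity. It is not the authors' method: the pattern-mining component is omitted, and the function `expand_query`, its parameters `k` and `min_sim`, and the two-dimensional toy vectors are assumptions made for the example.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_query(
    query_terms: List[str],
    embeddings: Dict[str, List[float]],
    k: int = 2,
    min_sim: float = 0.5,
) -> List[str]:
    """Append up to k nearest neighbours (similarity >= min_sim) per term."""
    expanded = list(query_terms)
    for term in query_terms:
        if term not in embeddings:
            continue  # out-of-vocabulary terms are left unexpanded
        scored = [
            (cosine(embeddings[term], vec), word)
            for word, vec in embeddings.items()
            if word not in expanded
        ]
        scored.sort(reverse=True)  # highest similarity first
        for sim, word in scored[:k]:
            if sim >= min_sim and word not in expanded:
                expanded.append(word)
    return expanded


# Hypothetical toy embeddings; real ones would come from a trained model.
toy_embeddings = {
    "flu": [1.0, 0.0],
    "influenza": [0.9, 0.1],
    "vaccine": [0.7, 0.7],
    "football": [0.0, 1.0],
}
narrow = expand_query(["flu"], toy_embeddings, k=1)
wide = expand_query(["flu"], toy_embeddings, k=3)
```

The `min_sim` cut-off keeps unrelated words out of the query even when `k` is generous, which matters for short microblog queries where a single off-topic term can dominate the ranking.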

    OLGA SÁNCHEZ RODRÍGUEZ [Material gráfico]

    ÁLBUM FAMILIAR CASA DE COLÓN. Digital copy. Madrid: Ministerio de Educación, Cultura y Deporte. Subdirección General de Coordinación Bibliotecaria, 201

    An Efficient Architecture for Information Retrieval in P2P Context Using Hypergraph

    Peer-to-peer (P2P) data-sharing systems now generate a significant portion of Internet traffic and have emerged as an accepted way to share enormous volumes of data. The need for widely distributed information systems supporting virtual organizations has given rise to a new category of P2P systems called schema-based. In such systems each peer is a database management system in itself, exposing its own schema. In this setting, the main objective is efficient search across peer databases, processing each incoming query without consuming excessive bandwidth. The usability of these systems depends on successful techniques to find and retrieve data; however, efficient and effective routing of content-based queries is an emerging problem in P2P networks. This work is an attempt to show that using mining algorithms in the P2P context may significantly improve the efficiency of such methods. Our proposed method is based on a combination of clustering and hypergraphs. We use ECCLAT to build an approximate clustering and discover meaningful clusters with slight overlap. We use the MTMINER algorithm to extract all minimal transversals of a hypergraph (the clusters) for query routing. The set of clusters improves the robustness of the query routing mechanism and the scalability of the P2P network. We compare the performance of our method with a baseline on the query routing problem. Our experimental results show that the proposed method achieves strong performance and scalability with respect to important criteria such as response time, precision and recall. Comment: 20 pages, 8 figures
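To make the notion of a minimal transversal concrete: a transversal is a vertex set that intersects every hyperedge, and it is minimal when no proper subset is also a transversal. The sketch below enumerates them by brute force on a small example. It is a naive illustration, not MTMINER, and is exponential in the number of vertices; the function name and the toy hypergraph are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List


def minimal_transversals(hyperedges: Iterable[Iterable[str]]) -> List[FrozenSet[str]]:
    """Enumerate all minimal transversals of a hypergraph by brute force.

    Candidates are visited in order of increasing size, so any transversal
    that contains no previously found transversal is itself minimal.
    Returns an empty list when no hyperedges are given.
    """
    edges = [frozenset(e) for e in hyperedges]
    if any(not e for e in edges):
        raise ValueError("an empty hyperedge can never be intersected")
    vertices = sorted(set().union(*edges)) if edges else []

    found: List[FrozenSet[str]] = []
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            cand = frozenset(candidate)
            if any(t <= cand for t in found):
                continue  # contains a smaller transversal, so not minimal
            if all(cand & e for e in edges):
                found.append(cand)
    return found


# Toy hypergraph: each hyperedge could stand for one cluster of items.
toy_edges = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
toy_result = minimal_transversals(toy_edges)
```

On the toy hypergraph the minimal transversals are {a, c}, {b, c} and {b, d}: each touches all three hyperedges, and removing any vertex leaves at least one hyperedge uncovered.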

    Conférence Internationale Francophone sur la Science des Données (CIFSD) Actes de la 9e édition

    The proceedings of the 9th edition of the Conférence Internationale Francophone sur la Science des Données (CIFSD, https://cifsd-2021.sciencesconf.org) bring together all the contributions presented at the conference between 9 and 11 June 2021. This edition was organized by Aix-Marseille Université and the Laboratoire d'Informatique et Systèmes (LIS UMR 7020). Because of the public health situation, it was held remotely from Marseille (France). The theme highlighted for this edition was data science for health.